
4.1 Datasets and Evaluation Metrics

Datasets. The statistical properties of the datasets are described in Table 1.

Movielens. A movie rating dataset from the GroupLens project at the University of Minnesota [8]. It can be downloaded from www.grouplens.org. Each row contains a user id, an item id, and a rating in the range [1, 5]. The training and testing parts are extracted with a ratio of 0.95 : 0.05; the testing part is then filtered by removing ratings whose user or item does not appear in the training part.
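As a concrete illustration, the split-and-filter step could look like the following sketch (the file name, column layout, and the use of pandas/scikit-learn are our assumptions; the paper does not publish code):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical Movielens dump: one row per (user, item, rating) event.
ratings = pd.read_csv("ratings.dat", sep="::", engine="python",
                      names=["user", "item", "rating", "timestamp"])

# 0.95 : 0.05 train/test split, as in the paper.
train, test = train_test_split(ratings, test_size=0.05, random_state=42)

# Drop test ratings whose user or item never appears in training
# (the "strange" users/items mentioned above).
test = test[test["user"].isin(set(train["user"])) &
            test["item"].isin(set(train["item"]))]
```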

FPT PLAY. This dataset is taken from the history of users on fptplay.vn [9], our online service that allows customers to watch a wide variety of movies, TV shows and more. We use a 40-day history, where the testing part consists of the last 10 days of activity of all users on the system; the same filtering as for the Movielens testing part is then applied.
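Assuming each log row carries a Unix timestamp (the file name and column names below are hypothetical), the time-based split could be sketched as:

```python
import pandas as pd

# Hypothetical viewing log; the real FPT Play schema may differ.
log = pd.read_csv("fptplay_history.csv",
                  names=["user", "item", "rating", "timestamp"])

# The last 10 of the 40 logged days form the testing part.
cutoff = log["timestamp"].max() - 10 * 24 * 3600  # ten days in seconds
train = log[log["timestamp"] <= cutoff]
test = log[log["timestamp"] > cutoff]

# Apply the same filtering as for Movielens: keep only test rows whose
# user and item both occur in the training part.
test = test[test["user"].isin(set(train["user"])) &
            test["item"].isin(set(train["item"]))]
```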

Evaluation Metrics

Popularity. This metric reflects how popular the items on the result lists are. In DWC, under the assumption that new items have low counting values, this metric is called novelty and is minimized by the algorithm. We argue that low popularity and novelty are not the same, and we do not optimize this value. For a target user $i$ and a list $O_i$ of $L$ recommended items, his or her popularity is $P_i = \frac{1}{L}\sum_{\alpha \in O_i} k_\alpha$, where $k_\alpha$ is the degree (counting value) of item $\alpha$. The popularity of an algorithm is the average of $P_i$ over all users.
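A direct reading of this definition in code (the container names are ours; `item_degree` holds the counting value $k_\alpha$ of each item):

```python
from statistics import mean

def popularity(rec_lists, item_degree):
    """Average degree of the recommended items per user, then over users.

    rec_lists:   dict mapping user id -> list of L recommended item ids
    item_degree: dict mapping item id -> number of users who rated it
    """
    per_user = [mean(item_degree[a] for a in items)
                for items in rec_lists.values()]
    return mean(per_user)
```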

Diversity. Diversity shows how different the suggested outcomes are between users: the higher the value, the more personalized the results. It is also related to popularity and accuracy. Recommending mostly popular items is a safe strategy for matching customer interest, but it leads to little difference between users. In contrast, using more unpopular products enhances personalization but carries a high risk of failing to satisfy customers.

Let $D_{ij}$ denote the number of distinct items between the result lists of users $i$ and $j$. Diversity is computed by averaging $D_{ij}$ over all pairs of users.
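Reading "the number of distinct items" as the size of the symmetric difference between the two lists (an assumption on our part; overlap-based variants also appear in the literature), diversity can be sketched as:

```python
from itertools import combinations
from statistics import mean

def diversity(rec_lists):
    """Average count of items appearing in exactly one list of each user pair."""
    return mean(len(set(a) ^ set(b))
                for a, b in combinations(rec_lists.values(), 2))
```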

Coverage. This metric demonstrates how efficiently a recommendation algorithm uses the available resources. It equals the fraction of items that are recommended to at least one user out of the total number of items (Table 4).
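A minimal sketch, assuming `n_items` is the size of the full catalogue:

```python
def coverage(rec_lists, n_items):
    """Fraction of the catalogue that reaches at least one user's list."""
    recommended = set().union(*rec_lists.values())
    return len(recommended) / n_items
```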

Congestion. [20] introduced the congestion problem and proposed this metric. Congestion occurs when a few distinct items dominate the suggestion lists of numerous users; therefore, the lower the value, the better. It can be seen as a harder version of coverage: coverage only checks whether an item appears in the result list of any user, while congestion accounts for how many times each item is recommended.
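The exact aggregation used in [20] is not restated here, so the following is only one plausible instantiation of the idea: count how many users receive each distinct item and report the mean count, so that heavier concentration on a few items yields a higher (worse) value.

```python
from collections import Counter

def congestion(rec_lists):
    """Mean number of times each distinct recommended item is suggested.

    A sketch of the concept; the formula in [20] may differ.
    """
    counts = Counter(item for items in rec_lists.values() for item in items)
    return sum(counts.values()) / len(counts)
```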

Precision. Precision is one of the accuracy metrics, which are always the first priority: if a method has poor accuracy, there is no need to consider its other properties. It is computed by counting the number of items correctly recommended to each user, then averaging these values over all users.
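A minimal precision@L sketch, where `ground_truth` maps each user to the set of items he or she actually consumed in the testing part (the names are ours):

```python
from statistics import mean

def precision(rec_lists, ground_truth):
    """Per-user fraction of correctly recommended items, averaged over users."""
    return mean(len(set(items) & ground_truth[u]) / len(items)
                for u, items in rec_lists.items())
```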

Rating. Rating is a score indicating how satisfied users are with the content of items. In this paper, together with precision, rating serves as another accuracy metric, whereas other methods use it as a feature. Our purpose is to validate that the model exploits low-degree items while still distinguishing novelty from bad quality. It is measured by first averaging the scores of all items recommended to each user, then averaging these values over all users.
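Assuming each item carries a mean score `item_rating[a]` (the paper does not state how per-item scores are aggregated, so this is a sketch under that assumption):

```python
from statistics import mean

def avg_rating(rec_lists, item_rating):
    """Average score of each user's recommended items, then over users."""
    return mean(mean(item_rating[a] for a in items)
                for items in rec_lists.values())
```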


